Adaptive Classifiers, Topic Drifts and GO Annotations
Abstract
Gene annotations with Gene Ontology (GO) codes offer scientists important options in their study of genes and their functions. Automatic GO annotation methods have the potential to supplement the intensive manual annotation processes. Annotation approaches using MEDLINE documents are generally two-phased: the first phase annotates documents with GO codes, and the second annotates gene products via the documents. In this paper we study document annotation with GO codes from a temporal perspective. Specifically, we build adaptive code-specific classifiers. We also study topic drift, i.e., changes in the contextual characteristics of annotations over time. We show that topic drift is significant, especially in the biological process GO hierarchy. This at least partially explains the particular challenges faced with codes of this hierarchy.
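As a rough illustration of what "topic drift" could mean in practice, the sketch below compares the vocabulary of documents annotated with one GO code in an earlier period against those from a later period. It is not the method used in the paper, which the abstract does not specify: the bag-of-words profile, the cosine-based score, the function names, and the toy texts are all assumptions made for this example.

```python
from collections import Counter
from math import sqrt


def term_profile(documents):
    """Build a bag-of-words term-count profile from a list of texts."""
    counts = Counter()
    for text in documents:
        counts.update(text.lower().split())
    return counts


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def topic_drift(earlier_docs, later_docs):
    """Drift score: near 0 for identical term usage, 1 for no shared terms."""
    return 1.0 - cosine_similarity(term_profile(earlier_docs),
                                   term_profile(later_docs))


# Hypothetical toy texts annotated with the same GO code in two periods.
earlier = ["kinase signaling in yeast", "yeast kinase cascade"]
later = ["kinase signaling in human tumor cells", "tumor suppressor pathway"]
print(round(topic_drift(earlier, later), 3))
```

Under these assumptions, a code whose score stays high across consecutive time windows would be one where a classifier trained on older documents sees a different context than the one it must annotate, which is the situation an adaptive classifier is meant to handle.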
Similar resources
Reservoir of Diverse Adaptive Learners and Stacking Fast Hoeffding Drift Detection Methods for Evolving Data Streams
The last decade has seen a surge of interest in adaptive learning algorithms for data stream classification, with applications ranging from predicting ozone level peaks, learning stock market indicators, to detecting computer security violations. In addition, a number of methods have been developed to detect concept drifts in these streams. Consider a scenario where we have a number of classifi...
The use of gene ontology evidence codes in preventing classifier assessment bias
MOTIVATION The biological community's reliance on computational annotations of protein function makes correct assessment of function prediction methods an issue of great importance. The fact that a large fraction of the annotations in current biological databases are based on computational methods can lead to bias in estimating the accuracy of function prediction methods. This can happen since ...
Fast Adaptive Real-Time Classification for Data Streams with Concept Drift
An important application of Big Data Analytics is the realtime analysis of streaming data. Streaming data imposes unique challenges to data mining algorithms, such as concept drifts, the need to analyse the data on the fly due to unbounded data streams and scalable algorithms due to potentially high throughput of data. Real-time classification algorithms that are adaptive to concept drifts and ...
A Comparison on How Statistical Tests Deal with Concept Drifts
RCD is a framework proposed to deal with recurring concept drifts. It stores classifiers together with a sample of data used to train them. If a concept drift occurs, RCD tests all the stored samples with a sample of actual data, trying to verify if this is a new context or an old one that is recurring. This is performed by a non-parametric multivariate statistical test to make the verification...
Just-in-Time Adaptive Classifiers - Part II: Designing the Classifier
Aging effects, environmental changes, thermal drifts, and soft and hard faults affect physical systems by changing their nature and behavior over time. To cope with a process evolution adaptive solutions must be envisaged to track its dynamics; in this direction, adaptive classifiers are generally designed by assuming the stationary hypothesis for the process generating the data with very few r...
Journal title:
- AMIA ... Annual Symposium proceedings. AMIA Symposium
Publication date: 2007